A Similarity-based Approach to Match Elements Across Versions of XML Documents
Abstract
XML documents are often used to provide inter-system interoperability. A related problem is that XML documents evolve over time, so identifying and understanding the changes they undergo becomes crucial. Several diff approaches based on syntactic and semantic analysis of the documents have been developed to address this problem. Their strategy is to find data fragments that are identical in both versions of an XML document and to match the corresponding elements through the use of context keys. However, depending on how XML documents are managed, there is no guarantee that the values of these keys remain the same across versions. Thus, unlike existing approaches, this paper proposes using similarity, rather than key equality, to match corresponding elements across XML versions. It also shows how this can be applied to support both syntactic and semantic XML diff applications.
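The idea of matching by similarity instead of key equality can be illustrated with a minimal sketch. This is not the paper's algorithm: the similarity measure (`difflib.SequenceMatcher` over flattened element text), the greedy pairing, the `threshold` value of 0.7, and the sample `staff`/`employee` data are all assumptions chosen for illustration. In the example, the `id` keys are regenerated in the second version, so key equality would match nothing, while content similarity still pairs the corresponding elements.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher


def element_text(elem):
    """Flatten an element into one string: tag, sorted attributes, nested text."""
    parts = [elem.tag]
    parts += [f"{k}={v}" for k, v in sorted(elem.attrib.items())]
    parts += [t.strip() for t in elem.itertext() if t.strip()]
    return " ".join(parts)


def similarity(a, b):
    """Character-level similarity in [0, 1] between two flattened elements."""
    return SequenceMatcher(None, element_text(a), element_text(b)).ratio()


def match_elements(old_root, new_root, threshold=0.7):
    """Greedily pair children of old_root with children of new_root.

    Only same-tag pairs whose similarity reaches `threshold` (an assumed
    value) are candidates; the best-scoring candidates are taken first,
    and each element is used at most once.
    """
    candidates = []
    for i, old in enumerate(old_root):
        for j, new in enumerate(new_root):
            if old.tag == new.tag:
                score = similarity(old, new)
                if score >= threshold:
                    candidates.append((score, i, j))
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))

    used_old, used_new, pairs = set(), set(), []
    for score, i, j in candidates:
        if i not in used_old and j not in used_new:
            used_old.add(i)
            used_new.add(j)
            pairs.append((i, j, score))
    return sorted(pairs)


# Hypothetical sample data: the id keys differ between the two versions.
V1 = """<staff>
  <employee id="1"><name>Ana Souza</name><dept>Sales</dept></employee>
  <employee id="2"><name>Bruno Lima</name><dept>Research</dept></employee>
</staff>"""

V2 = """<staff>
  <employee id="17"><name>Bruno Lima</name><dept>Research and Development</dept></employee>
  <employee id="18"><name>Ana Souza</name><dept>Sales</dept></employee>
  <employee id="19"><name>Carla Dias</name><dept>Finance</dept></employee>
</staff>"""

old_root = ET.fromstring(V1)
new_root = ET.fromstring(V2)
pairs = match_elements(old_root, new_root)

for i, j, score in pairs:
    print(f"old[{i}] <-> new[{j}]  similarity={score:.2f}")

# Children of the new version left unmatched can be treated as insertions.
matched_new = {j for _, j, _ in pairs}
inserted = [j for j in range(len(new_root)) if j not in matched_new]
print("unmatched in new version:", inserted)
```

Greedy pairing keeps the sketch short; a real diff tool would more likely compare structure and content separately and tune the threshold to the data.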
Related resources
Metaheuristic Clustering of Persian XML Documents Based on Structural and Content Similarity
Given the increasing number of XML documents, organizing them effectively so that useful information can be retrieved from them is essential. One possible solution is to cluster XML documents in order to discover knowledge. A key issue in clustering XML documents is how to measure the similarity between them. Conventional clustering of text documents using a do...
Semantic and Structure Based XML Similarity: The XS3 Prototype
Due to the ever-increasing web availability of XML-based data, an efficient approach to compare XML documents becomes crucial in information retrieval. Such comparison of XML documents has applications in version control (finding, scoring and browsing changes between different versions of a document), change management and data warehousing (support of temporal queries and index maintenance) [3,...
Identifying Structural Mapping between XML Fragments
With the popularity of XML for representing and exchanging data, the need for parties to agree on a common schema or DTD has become significant. Getting various parties to agree on a common standard is often a time-consuming and complex problem. This work suggests an approach to identify mappings between XML documents with different schemas. Introduction Identifying mapping between two...
Prototyping a Vibrato-Aware Query-By-Humming (QBH) Music Information Retrieval System for Mobile Communication Devices: Case of Chromatic Harmonica
Background and Aim: The current research aims at prototyping query-by-humming music information retrieval systems for smartphones. Methods: This multi-method research simultaneously follows the simulation technique from mixed models of the operations research methodology and the documentary research method. Two chromatic harmonica albums comprised the research population. To achieve the purpose ...
Similarity of XML-Schema Elements: A Structural and Information Content Approach
EXtensible Markup Language (XML)-Schemas are the emerging standards for describing and validating semi-structured documents across the Internet, due to the rich set of modeling constructors, types and constraints they provide. Semantic similarity is growing in importance in different settings, such as digital libraries, heterogeneous databases and, in particular, the Semantic Web. The focus of ...